Arabic characters in Unicode

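As of Unicode 6.0, the following blocks encode Arabic characters:

  - Arabic (U+0600–U+06FF)
  - Arabic Supplement (U+0750–U+077F)
  - Arabic Presentation Forms-A (U+FB50–U+FDFF)
  - Arabic Presentation Forms-B (U+FE70–U+FEFF)
  - Rumi Numeral Symbols (U+10E60–U+10E7F)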

The basic Arabic range encodes the standard letters, the most common diacritics and the Arabic-Indic digits; it does not encode contextual forms. Its core, U+0621–U+0652, is based directly on ISO 8859-6. The Arabic Supplement range encodes letter variants mostly used for writing African (non-Arabic) languages. The Arabic Presentation Forms-A range encodes contextual forms and ligatures of letter variants needed for Persian, Urdu, Sindhi and Central Asian languages. The Arabic Presentation Forms-B range encodes spacing forms of Arabic diacritics and further contextual letter forms. The presentation forms are present only for compatibility with older standards and are not currently needed for encoding text.[2]
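The block boundaries described above can be turned into a small lookup. The following is a minimal sketch in Python; the names `ARABIC_BLOCKS` and `arabic_block` are illustrative and not part of any library, and the ranges are the ones shown in the code charts of this article (Unicode 6.0).

```python
# Block ranges as shown in this article's code charts (Unicode 6.0).
# ARABIC_BLOCKS and arabic_block are hypothetical names used for illustration.
ARABIC_BLOCKS = [
    (0x0600, 0x06FF, "Arabic"),
    (0x0750, 0x077F, "Arabic Supplement"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
    (0xFE70, 0xFEFF, "Arabic Presentation Forms-B"),
    (0x10E60, 0x10E7F, "Rumi Numeral Symbols"),
]


def arabic_block(ch):
    """Return the name of the Arabic-related block containing ch, or None."""
    cp = ord(ch)
    for start, end, name in ARABIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None


print(arabic_block("\u0628"))      # Arabic
print(arabic_block("\uFE90"))      # Arabic Presentation Forms-B
print(arabic_block("A"))           # None
```

A linear scan is enough for five ranges; a larger table would normally be searched with `bisect`.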

Contextual forms

A demonstration for the basic alphabet used in Modern Standard Arabic:

General Unicode    Isolated    End    Middle    Beginning    Name
U+0627 ا    U+FE8D ا    U+FE8E ـا    -    -    ʾalif
U+0628 ب    U+FE8F ب    U+FE90 ـب    U+FE92 ـبـ    U+FE91 بـ    bāʾ
U+062A ت    U+FE95 ت    U+FE96 ـت    U+FE98 ـتـ    U+FE97 تـ    tāʾ
U+062B ث    U+FE99 ث    U+FE9A ـث    U+FE9C ـثـ    U+FE9B ثـ    ṯāʾ
U+062C ج    U+FE9D ج    U+FE9E ـج    U+FEA0 ـجـ    U+FE9F جـ    ǧīm
U+062D ح    U+FEA1 ح    U+FEA2 ـح    U+FEA4 ـحـ    U+FEA3 حـ    ḥāʾ
U+062E خ    U+FEA5 خ    U+FEA6 ـخ    U+FEA8 ـخـ    U+FEA7 خـ    ḫāʾ
U+062F د    U+FEA9 د    U+FEAA ـد    -    -    dāl
U+0630 ذ    U+FEAB ذ    U+FEAC ـذ    -    -    ḏāl
U+0631 ر    U+FEAD ر    U+FEAE ـر    -    -    rāʾ
U+0632 ز    U+FEAF ز    U+FEB0 ـز    -    -    zayn/zāy
U+0633 س    U+FEB1 س    U+FEB2 ـس    U+FEB4 ـسـ    U+FEB3 سـ    sīn
U+0634 ش    U+FEB5 ش    U+FEB6 ـش    U+FEB8 ـشـ    U+FEB7 شـ    šīn
U+0635 ص    U+FEB9 ص    U+FEBA ـص    U+FEBC ـصـ    U+FEBB صـ    ṣād
U+0636 ض    U+FEBD ض    U+FEBE ـض    U+FEC0 ـضـ    U+FEBF ضـ    ḍād
U+0637 ط    U+FEC1 ط    U+FEC2 ـط    U+FEC4 ـطـ    U+FEC3 طـ    ṭāʾ
U+0638 ظ    U+FEC5 ظ    U+FEC6 ـظ    U+FEC8 ـظـ    U+FEC7 ظـ    ẓāʾ
U+0639 ع    U+FEC9 ع    U+FECA ـع    U+FECC ـعـ    U+FECB عـ    ʿayn
U+063A غ    U+FECD غ    U+FECE ـغ    U+FED0 ـغـ    U+FECF غـ    ġayn
U+0641 ف    U+FED1 ف    U+FED2 ـف    U+FED4 ـفـ    U+FED3 فـ    fāʾ
U+0642 ق    U+FED5 ق    U+FED6 ـق    U+FED8 ـقـ    U+FED7 قـ    qāf
U+0643 ك    U+FED9 ك    U+FEDA ـك    U+FEDC ـكـ    U+FEDB كـ    kāf
U+0644 ل    U+FEDD ل    U+FEDE ـل    U+FEE0 ـلـ    U+FEDF لـ    lām
U+0645 م    U+FEE1 م    U+FEE2 ـم    U+FEE4 ـمـ    U+FEE3 مـ    mīm
U+0646 ن    U+FEE5 ن    U+FEE6 ـن    U+FEE8 ـنـ    U+FEE7 نـ    nūn
U+0647 ه    U+FEE9 ه    U+FEEA ـه    U+FEEC ـهـ    U+FEEB هـ    hāʾ
U+0648 و    U+FEED و    U+FEEE ـو    -    -    wāw
U+064A ي    U+FEF1 ي    U+FEF2 ـي    U+FEF4 ـيـ    U+FEF3 يـ    yāʾ
U+0622 آ    U+FE81 آ    U+FE82 ـآ    -    -    ʾalif maddah
U+0629 ة    U+FE93 ة    U+FE94 ـة    -    -    tāʾ marbūṭah
U+0649 ى    U+FEEF ى    U+FEF0 ـى    -    -    ʾalif maqṣūrah
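
The relationship between a presentation-form code point and its base letter is recorded in the Unicode Character Database as a tagged compatibility decomposition. As a minimal sketch, the Python standard library's `unicodedata.decomposition` can be used to read it; the helper name `contextual_form` is hypothetical. The database tags `final`, `medial` and `initial` correspond to the End, Middle and Beginning columns above.

```python
import unicodedata


def contextual_form(ch):
    """Return (form, base) for a presentation-form character, else None.

    form is the tag of the compatibility decomposition, for example
    'isolated', 'final', 'medial' or 'initial'.
    """
    decomp = unicodedata.decomposition(ch)    # e.g. '<final> 0628'
    if not decomp.startswith("<"):
        return None                           # no tagged compatibility mapping
    tag, *codes = decomp.split()
    base = "".join(chr(int(code, 16)) for code in codes)
    return tag.strip("<>"), base


# The four forms of bāʾ, in the column order of the table.
for cp in (0xFE8F, 0xFE90, 0xFE92, 0xFE91):
    print(f"U+{cp:04X}", contextual_form(chr(cp)))
```

The same mapping is what `unicodedata.normalize("NFKC", text)` applies when it folds presentation forms back to the basic letters.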

Punctuation and ornaments

In regular Arabic typing, the only Arabic-specific punctuation mark in common use is the Arabic comma (U+060C). It is sometimes replaced by the comma used in Latin-based scripts (U+002C).
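
Because either comma may appear in practice, code that splits Arabic text on commas may want to accept both. A minimal sketch, assuming a hypothetical helper `split_list` and treating the two code points as equivalent separators:

```python
import re

ARABIC_COMMA = "\u060C"   # ، as shown in the Arabic block chart
LATIN_COMMA = "\u002C"    # ,

# Either comma, with optional surrounding whitespace.
_COMMA = re.compile(rf"\s*[{LATIN_COMMA}{ARABIC_COMMA}]\s*")


def split_list(text):
    """Split text on Arabic or Latin commas and drop empty items."""
    return [item for item in _COMMA.split(text.strip()) if item]


# Three letters separated by one Arabic comma and one Latin comma.
print(split_list("\u0623\u060C \u0628, \u062C"))
```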

Word ligatures

Arabic Presentation Forms-A defines a few characters as "word ligatures" for terms frequently used in formulaic expressions in Arabic. They are rarely used outside professional liturgical typesetting. The Rial sign is likewise normally written out in full rather than with its ligature.
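
To illustrate "written out in full", the sketch below uses compatibility normalization (NFKC) from Python's standard `unicodedata` module to expand a word ligature into ordinary letters. The helper name `expand_word_ligature` is hypothetical, and the exact letters produced come from the Unicode data bundled with the Python version in use.

```python
import unicodedata


def expand_word_ligature(ch):
    """Expand a word ligature into its spelled-out letters via NFKC."""
    return unicodedata.normalize("NFKC", ch)


rial = "\uFDFC"
print(unicodedata.name(rial))                 # RIAL SIGN
expanded = expand_word_ligature(rial)
print([f"U+{ord(c):04X}" for c in expanded])  # code points of the spelled-out word
```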

Code blocks

Arabic

Arabic[1]
Unicode.org chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+060x ؀ ؁ ؂ ؃ ؆ ؇ ؈ ؉ ؊ ؋ ، ؍ ؎ ؏
U+061x ؐ ؑ ؒ ؓ ؔ ؕ ؖ ؗ ؘ ؙ ؚ ؛ ؞ ؟
U+062x ؠ ء آ أ ؤ إ ئ ا ب ة ت ث ج ح خ د
U+063x ذ ر ز س ش ص ض ط ظ ع غ ػ ؼ ؽ ؾ ؿ
U+064x ـ ف ق ك ل م ن ه و ى ي ً ٌ ٍ َ ُ
U+065x ِ ّ ْ ٓ ٔ ٕ ٖ ٗ ٘ ٙ ٚ ٛ ٜ ٝ ٞ ٟ
U+066x ٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩ ٪ ٫ ٬ ٭ ٮ ٯ
U+067x ٰ ٱ ٲ ٳ ٴ ٵ ٶ ٷ ٸ ٹ ٺ ٻ ټ ٽ پ ٿ
U+068x ڀ ځ ڂ ڃ ڄ څ چ ڇ ڈ ډ ڊ ڋ ڌ ڍ ڎ ڏ
U+069x ڐ ڑ ڒ ړ ڔ ڕ ږ ڗ ژ ڙ ښ ڛ ڜ ڝ ڞ ڟ
U+06Ax ڠ ڡ ڢ ڣ ڤ ڥ ڦ ڧ ڨ ک ڪ ګ ڬ ڭ ڮ گ
U+06Bx ڰ ڱ ڲ ڳ ڴ ڵ ڶ ڷ ڸ ڹ ں ڻ ڼ ڽ ھ ڿ
U+06Cx ۀ ہ ۂ ۃ ۄ ۅ ۆ ۇ ۈ ۉ ۊ ۋ ی ۍ ێ ۏ
U+06Dx ې ۑ ے ۓ ۔ ە ۖ ۗ ۘ ۙ ۚ ۛ ۜ ۝ ۞ ۟
U+06Ex ۠ ۡ ۢ ۣ ۤ ۥ ۦ ۧ ۨ ۩ ۪ ۫ ۬ ۭ ۮ ۯ
U+06Fx ۰ ۱ ۲ ۳ ۴ ۵ ۶ ۷ ۸ ۹ ۺ ۻ ۼ ۽ ۾ ۿ
Notes
1.^ As of Unicode version 6.0
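
A chart in the layout used here can be regenerated from the Unicode data that ships with Python. The following is a minimal sketch; `chart_rows` is a hypothetical helper, the `.` placeholder for unassigned positions is an arbitrary choice, and the result reflects the Unicode version bundled with the interpreter, which may be newer than 6.0.

```python
import unicodedata


def chart_rows(start, end, missing="."):
    """Yield one text row per 16 code points in start..end (inclusive)."""
    for row in range(start & ~0xF, end + 1, 16):
        cells = []
        for cp in range(row, row + 16):
            ch = chr(cp)
            if unicodedata.name(ch, None) is None:
                cells.append(missing)           # unassigned in this Unicode version
            elif unicodedata.category(ch).startswith("M"):
                cells.append("\u25CC" + ch)     # show combining marks on a dotted circle
            else:
                cells.append(ch)
        yield f"U+{row >> 4:03X}x " + " ".join(cells)


print("  0 1 2 3 4 5 6 7 8 9 A B C D E F")
for line in chart_rows(0x0750, 0x077F):
    print(line)
```

Calling `chart_rows(0x10E60, 0x10E7F)` produces the two Rumi Numeral Symbols rows in the same way.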

Arabic Supplement

Arabic Supplement[1]
Unicode.org chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+075x ݐ ݑ ݒ ݓ ݔ ݕ ݖ ݗ ݘ ݙ ݚ ݛ ݜ ݝ ݞ ݟ
U+076x ݠ ݡ ݢ ݣ ݤ ݥ ݦ ݧ ݨ ݩ ݪ ݫ ݬ ݭ ݮ ݯ
U+077x ݰ ݱ ݲ ݳ ݴ ݵ ݶ ݷ ݸ ݹ ݺ ݻ ݼ ݽ ݾ ݿ
Notes
1.^ As of Unicode version 6.0

Arabic Presentation Forms-A

The characters in this block are mostly ligatures that can be composed from the characters in the preceding charts, with the exception of the ornate parentheses ﴾ ﴿ (U+FD3E and U+FD3F) and the ligatures of common liturgical phrases.

Arabic Presentation Forms-A[1]
Unicode.org chart (PDF)
[Chart rows U+FB5x through U+FDFx, columns 0 to F; the glyph cells are missing from the source.]
Notes
1.^ As of Unicode version 6.0
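
One way to see which characters of Arabic Presentation Forms-A are not composed from other characters is to list the assigned code points that have no decomposition mapping. This is a minimal sketch using Python's standard `unicodedata` module; `without_decomposition` is a hypothetical helper, and the full list depends on the Unicode version bundled with the interpreter.

```python
import unicodedata


def without_decomposition(start=0xFB50, end=0xFDFF):
    """Return assigned characters in start..end that have no decomposition."""
    result = []
    for cp in range(start, end + 1):
        ch = chr(cp)
        if unicodedata.name(ch, None) is None:
            continue                      # unassigned code point or noncharacter
        if not unicodedata.decomposition(ch):
            result.append(ch)
    return result


for ch in without_decomposition():
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The ornate parentheses U+FD3E and U+FD3F appear in this list, while ligatures such as U+FC00 do not.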

Arabic Presentation Forms-B

The forms in this block can all be composed from the characters in the basic Arabic chart.

Arabic Presentation Forms-B[1]
Unicode.org chart (PDF)
[Chart rows U+FE7x through U+FEFx, columns 0 to F; the glyph cells are missing from the source.]
Notes
1.^ As of Unicode version 6.0
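
Going in the other direction, from a basic letter and a position to its Presentation Forms-B code point, can be sketched by inverting the decomposition data. The names `build_form_table` and `FORMS` are hypothetical, and this is only a lookup table, not a shaping engine: real text rendering selects contextual glyphs in the font rather than through these code points.

```python
import unicodedata


def build_form_table(start=0xFE70, end=0xFEFF):
    """Map (base letters, form tag) to the matching presentation-form character."""
    table = {}
    for cp in range(start, end + 1):
        decomp = unicodedata.decomposition(chr(cp))   # e.g. '<final> 0628'
        if not decomp.startswith("<"):
            continue                                  # unassigned or no mapping
        tag, *codes = decomp.split()
        base = "".join(chr(int(code, 16)) for code in codes)
        table[(base, tag.strip("<>"))] = chr(cp)
    return table


FORMS = build_form_table()
beh = "\u0628"
final_beh = FORMS[(beh, "final")]
print(f"U+{ord(final_beh):04X}")   # U+FE90
```

Keys may hold more than one base character, as with the lām-ʾalif ligature U+FEFB, whose key is the pair U+0644 U+0627 with the tag `isolated`.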

Rumi Numeral Symbols

Rumi Numeral Symbols[1]
Unicode.org chart (PDF)
  0 1 2 3 4 5 6 7 8 9 A B C D E F
U+10E6x 𐹠 𐹡 𐹢 𐹣 𐹤 𐹥 𐹦 𐹧 𐹨 𐹩 𐹪 𐹫 𐹬 𐹭 𐹮 𐹯
U+10E7x 𐹰 𐹱 𐹲 𐹳 𐹴 𐹵 𐹶 𐹷 𐹸 𐹹 𐹺 𐹻 𐹼 𐹽 𐹾
Notes
1.^ As of Unicode version 6.0

References

  1. ^ Unicode v6.0 (UAX#41): Scripts
  2. ^ The Unicode Consortium. The Unicode Standard, Version 6.0.0. Mountain View, CA: The Unicode Consortium, 2011. ISBN 978-1-936213-01-6. Chapter 8.
